Several variants of a stochastic local search process for constructing the synaptic
weights of an Ising perceptron are studied. In this process, binary patterns are
sequentially presented to the Ising perceptron and are then learned as the synaptic
weight configuration is modified through a chain of single- or double-weight flips
within the compatible weight configuration space of the earlier learned patterns.
This process is able to reach a storage capacity of $\alpha \simeq 0.63$ for pattern length
$N = 101$ and $\alpha \simeq 0.41$ for $N = 1001$. If in addition a relearning process is exploited,
the learning performance is further improved to a storage capacity of $\alpha \simeq 0.80$ for
$N = 101$ and $\alpha \simeq 0.42$ for $N = 1001$. We found that, for a given learning task,
the solutions constructed by the random walk learning process are separated by
a typical Hamming distance, which decreases with the constraint density $\alpha$ of the
learning task; at a fixed value of $\alpha$, the width of the Hamming distance distributions
decreases with $N$.
I. INTRODUCTION

A single-layered feed-forward network of neurons, referred to as a perceptron, is an
elementary building block of complex neural networks. It is also one of the basic structures
for learning and memory [1]. In a perceptron, $N$ input neurons (units) are connected to a
single output unit by synapses of continuous or discrete-valued synaptic weights. The learning
task is to set the weight values for these $N$ synapses such that an extensive number
$M = \alpha N$ of input patterns are correctly classified (see Fig. 1). The parameter
$\alpha \equiv M/N$ is called the constraint density. An assignment of these weights is referred
to as a solution if the perceptron correctly classifies all the input patterns with this weight
assignment. Compared with perceptrons with real-valued synaptic weights, Ising perceptrons,
whose synaptic weights are binary, are much simpler for large-scale electronic implementations
and more robust against noise. An Ising perceptron is also relevant to real neural systems, as
the synaptic weight between two neurons actually takes bounded values and has a limited number
of synaptic states [2], [5]. On the other hand, training a real-valued perceptron is easy (e.g.,
with the Minover algorithm [3] or the AdaTron algorithm [4]), but training an Ising perceptron
is known to be an NP-complete problem [6]. Given $\alpha N$ input patterns, the computation time
needed to find a solution may grow exponentially with the number of weights $N$ in the worst
case. A complete enumeration of all possible weight states is only feasible for small systems
up to $N = 25$ [7-10]. In recent years, research on efficient heuristic algorithms has been
rather active.

If the number $M$ of input patterns is too large, a perceptron will be unable to correctly
classify all of them, no matter how the synaptic weights are modified. This is a phase
transition phenomenon of the solution space of the perceptron. In the case that the $M$ input
binary patterns are sampled uniformly at random from the set of all binary patterns, the
maximal value $\alpha_s$ of the constraint density $\alpha$, i.e., the storage capacity up to
which a solution still exists, has been calculated by statistical physics methods. For the
continuous perceptron subject to the spherical constraint, Gardner and Derrida found that
$\alpha_s = 2$ [18], [19]. In the thermodynamic limit $N \to \infty$, the continuous perceptron
cannot correctly classify more than $2N$ random input patterns. When the synaptic weight is
restricted to binary values, $\alpha_s$ was predicted to be 0.83 by Krauth and Mezard using the
first-step replica-symmetry-broken spin-glass theory [20]. This prediction was confirmed by
numerical simulations of small-size systems (plus an extrapolation to large $N$) [7], [9], [21].

The theoretically predicted storage capacity $\alpha_s$ represents the upper limit of the
constraint density $\alpha$ achievable by any learning strategy. As the constraint density
$\alpha$ increases, it is expected that the solution space of the Ising perceptron breaks into
a huge number of disjoint ergodic components [22]. Solutions from different components are
significantly different. One can define a connected component of the weight space as a cluster
of solutions in which any two solutions are connected by a path of successive single-weight
flips [23], [24]. These solution clusters are separated by weight configurations that only
correctly classify a subset of the input patterns. These partial solutions act as dynamical
traps for local search algorithms and make the learning task hard. An adaptive genetic
algorithm was suggested by Kohler in 1990, which could reach $\alpha \simeq 0.7$ for systems of
$N = 255$ [11]. Simulated annealing techniques were used by Horner [22], but critical slowing
down of the search process was observed, due to the very rugged energy landscape of the
problem. Simulated annealing was also used to study the statistical structure of the energy
landscape of the Ising perceptron. The analysis of the distribution of distances between
global minima obtained by simulated annealing for small $\alpha$ indicated that the distance
distribution becomes a delta function in the thermodynamic limit [25]. Taking advantage of the
efficient algorithms that exist for the real-valued perceptron, an alternative approach was to
clip the trained real-valued weights of the continuous perceptron into binary values
[H, 26-29]. Not all synaptic weights can be correctly specified by clipping, however, and for
those uncertain weights complete enumeration was then adopted. A message-passing algorithm was
developed by Braunstein and Zecchina for the Ising perceptron [15], which was able to reach
$\alpha \simeq 0.7$ for $N > 1000$. The efficiency of this belief-propagation algorithm was later
conjectured to be due to the existence of a sub-exponential number of large solution clusters
in the weight space [24]. An on-line learning algorithm inspired by this belief-propagation
algorithm was also studied [16], in which hidden discrete internal states are added to the
synaptic weights.

In real neural systems, the microscopic mechanism of perceptronal learning is the Hebbian
rule of synaptic modification (spike-timing-dependent synaptic plasticity may be exploited;
see, e.g., Refs. [30], [31]). The learning processes in biological perceptronal systems are
expected to be much simpler than the various sophisticated learning processes of artificial
perceptrons. Two other important aspects of biological perceptron systems are that (i) the
patterns to be classified are usually read into the system in a sequential order, so they are
learned one by one, and (ii) when a new pattern is being learned, there are biological
mechanisms which reactivate old learned patterns; such recalling processes help to prevent
old patterns from being forgotten as new patterns are learned (see, e.g., the experimental
investigation of Ref. [32]). Motivated by these biological considerations, we investigate in
this paper a simple sequential learning mechanism, namely synaptic-weight-space random
walking. In this random walking mechanism, the $\alpha N$ patterns are introduced into the
system in a randomly permuted sequential order, and a random walk of single- or double-weight
flips is performed until each newly added pattern is correctly classified (learned). The
previously learned patterns are not allowed to be misclassified in later stages of the
learning process. We perform extensive numerical simulations on several variants of this simple
sequential local learning rule and find that this mechanism performs well on systems of
$N \sim 10^3$ neurons or less.

FIG. 1: (Color online) Sketch of the Ising perceptron and the single-weight random walking
process in the corresponding weight space. (a) $N$ input units (open circles) feed directly to
a single output unit (solid circle). A binary input pattern
$(\xi_1^\mu, \xi_2^\mu, \ldots, \xi_N^\mu)$ of length $N$ is mapped through a sign function to
a binary output $\sigma^\mu$, i.e., $\sigma^\mu = \mathrm{sgn}(\sum_{i=1}^{N} J_i \xi_i^\mu)$.
The set of $N$ binary synaptic weights $\{J_i\}$ is regarded as a solution of the perceptron
problem if the output $\sigma^\mu = \sigma_0^\mu$ for each of the $M = \alpha N$ input patterns
$\mu \in [1, M]$, where $\sigma_0^\mu$ is a preset binary value. (b) A solution space random
walking path (indicated by arrows). An open circle represents a configuration that satisfies
the first $m + 1$ input patterns, while a black circle and a gray circle represent,
respectively, a configuration that satisfies the first $m$ and the first $m - 1$ input
patterns. An edge between two configurations means that these two configurations are related
by a single-weight flip.

The paper is organized as follows. The Ising perceptron learning problem is defined in more
detail in Sec. II. Several strategies of learning by random walks are presented in Sec. III.
In Sec. IV, an experimental study of the learning algorithms is carried out; the overlap
distribution of solutions as well as the performance of the different local search algorithms
is reported. A summary and discussion are given in Sec. V.

Sequential random walk search algorithms were recently investigated in various combinatorial
satisfaction problems (see, e.g., Refs. [33]-[35]). The present work adds evidence that
the solution space random walking mechanism, although very simple and easy to implement,
is able to solve many nontrivial problem instances of a given complex learning or constraint
satisfaction problem.
II. THE RANDOM CLASSIFICATION PROBLEM

For the Ising perceptron depicted schematically in Fig. 1(a), $N$ input units are connected
to a single output unit by $N$ synapses of weight $J_i = \pm 1$ ($i = 1, 2, \ldots, N$). The
perceptron tries to learn $M = \alpha N$ associations $\{\xi^\mu, \sigma_0^\mu\}$
($\mu = 1, 2, \ldots, M$), where $\xi^\mu = (\xi_1^\mu, \xi_2^\mu, \ldots, \xi_N^\mu)$ is an
input pattern with $\xi_i^\mu = \pm 1$, and $\sigma_0^\mu = \pm 1$ is the desired classification
of the input pattern $\mu$. Given the input pattern $\xi^\mu$, the actual output $\sigma^\mu$ of
the perceptron is

$$\sigma^\mu = \mathrm{sgn}\left( \sum_{i=1}^{N} J_i \xi_i^\mu \right) . \tag{1}$$

The perceptron can modify its synaptic weight configuration
$\{J_i\} \equiv (J_1, J_2, \ldots, J_N)$ to achieve complete classification, i.e.,
$\sigma^\mu = \sigma_0^\mu$ for each of the $M$ input patterns. The solution space of the Ising
perceptron is composed of all the weight configurations $\{J_i\}$ that satisfy
$\sigma_0^\mu \sum_{i=1}^{N} J_i \xi_i^\mu > 0$ for $\mu = 1, 2, \ldots, M$.

For the random Ising perceptron problem studied in this paper, each of the $M$ input
binary patterns $\xi^\mu$ is sampled uniformly at random from the set of all $2^N$ binary
patterns of length $N$, and the classification $\sigma_0^\mu$ is equal to $\pm 1$ with equal
probability. For $N$ sufficiently large, the solution space of such a model system is
non-empty as long as $\alpha < 0.83$ [20]. Constructing such a solution configuration
$\{J_i\}$, however, is quite a non-trivial task.
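As an illustration of the definitions above, the following minimal Python sketch draws a random problem instance and tests whether a weight configuration is a solution in the sense of Eq. (1). The function names (`random_instance`, `output`, `is_solution`) are hypothetical and chosen only for this example; $N$ is assumed odd so that the argument of the sign function never vanishes.

```python
import random

def random_instance(n, m, rng):
    """Draw m uniformly random +/-1 patterns of length n and m random +/-1 labels."""
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    labels = [rng.choice((-1, 1)) for _ in range(m)]
    return patterns, labels

def output(weights, pattern):
    """Perceptron output sgn(sum_i J_i xi_i); n is assumed odd, so the sum is never 0."""
    field = sum(j * x for j, x in zip(weights, pattern))
    return 1 if field > 0 else -1

def is_solution(weights, patterns, labels):
    """True if every pattern is mapped to its desired label."""
    return all(output(weights, p) == s for p, s in zip(patterns, labels))
```

For example, `is_solution([1, 1, 1], [[1, -1, 1], [-1, -1, 1]], [1, -1])` evaluates to `True`, since the two sums are $+1$ and $-1$.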

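A more stringent learning problem is to find a weight configuration $\{J_i\}$ such that, for
each input pattern $\mu$,

$$\frac{\sigma_0^\mu}{\sqrt{N}} \sum_{i=1}^{N} J_i \xi_i^\mu \geq \kappa , \tag{2}$$

where $\kappa \geq 0$ is a preset parameter [20]. The most efficient way of solving this
constraint satisfaction problem appears to be the message-passing algorithm of Braunstein and
Zecchina discussed in Sec. I.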

One can perform a gauge transform $\xi_i^\mu \to \xi_i^\mu \sigma_0^\mu$ on each input pattern.
Under this gauge transform, each desired output is transformed to $\sigma_0^\mu = 1$. Without
loss of generality, in the remaining part of this paper we will assume $\sigma_0^\mu = 1$ for
any input pattern $\mu$. Considering the case of $N$ being odd, we define the stability field
of a pattern $\mu$ as

$$h^\mu = \sum_{i=1}^{N} J_i \xi_i^\mu . \tag{3}$$

To ensure the local stability of input pattern $\mu$ under changes of the weight configuration
$\{J_i\}$, in analogy to Eq. (2), we introduce a stability parameter $\Delta \geq 1$ and
require that $h^\mu \geq \Delta$ for each $\mu$. Because $N$ is odd, $h^\mu$ is an odd integer,
and a single-weight flip changes it by $\pm 2$; input patterns with $h^\mu \geq 3$ are
therefore stable against a single-weight flip. For the single-weight flipping processes of the
next section, the input patterns with $h^\mu = 1$ are referred to as barely learned patterns,
as these patterns may become misclassified after the weight configuration makes a single flip.
Similarly, for the double-weight flipping process of the next section, the input patterns with
$h^\mu = 1$ or $h^\mu = 3$ are referred to as barely learned patterns.
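The gauge transform, the stability field of Eq. (3), and the notion of barely learned patterns can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names and the `double_flip` switch are assumptions of this example, and pattern indices are 0-based.

```python
def gauge_transform(patterns, labels):
    """Multiply each pattern by its desired output, so that every target becomes +1."""
    return [[x * s for x in p] for p, s in zip(patterns, labels)]

def stability_fields(weights, patterns):
    """Stability field h = sum_i J_i xi_i of each (gauge-transformed) pattern."""
    return [sum(j * x for j, x in zip(weights, p)) for p in patterns]

def barely_learned(fields, double_flip=False):
    """Indices of learned patterns that a flip could break:
    h = 1 for single-weight flips, h = 1 or 3 for double-weight flips."""
    threshold = 3 if double_flip else 1
    return [mu for mu, h in enumerate(fields) if 0 < h <= threshold]
```

With all weights equal to $+1$ and $N = 5$, the patterns $(1,1,1,-1,-1)$, $(1,1,1,1,-1)$, and $(1,1,1,1,1)$ have stability fields 1, 3, and 5, so only the first is barely learned under single-weight flips, and the first two under double-weight flips.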



III. LEARNING BY RANDOM WALKS 



Random walk processes were used in a series of works [33], [34] to find solutions for
constraint satisfaction problems. They were also used as tools to study the solution space
structure of these constraint satisfaction problems [33], [34], [40]. Various local search
strategies have been developed to improve the performance of random walk stochastic searching
[41]. The random walk learning strategies of this work follow the SEQSAT algorithm of
Ref. [34].

An initial weight configuration
$\{J_i^{(0)}\} \equiv (J_1^{(0)}, J_2^{(0)}, \ldots, J_N^{(0)})$ is randomly generated at time
$t = 0$. The first pattern $\xi^1$ is applied to the Ising perceptron. If this pattern is
correctly classified under the initial weight configuration (i.e., $h^1 > 0$), then the second
pattern $\xi^2$ is applied; otherwise the weight configuration is adjusted by a sequence of
elementary local changes until $\xi^1$ is correctly classified. The algorithm then proceeds
with the second pattern $\xi^2$, the third pattern $\xi^3$, etc., in sequential order. An
elementary local change of the weight configuration is achieved either by a single-weight flip
(SWF) or by a double-weight flip (DWF).

Suppose at time $t$ the weight configuration is
$\{J_i^{(t)}\} \equiv (J_1^{(t)}, \ldots, J_N^{(t)})$, and suppose this configuration correctly
classifies the first $m$ input patterns $\xi^\mu$ ($\mu = 1, \ldots, m$) but not the
$(m+1)$-th pattern $\xi^{m+1}$. The configuration $\{J_i\}$ will keep wandering in the solution
space of the first $m$ patterns until a configuration that correctly classifies $\xi^{m+1}$ is
reached [see Fig. 1(b)]. In the SWF protocol, a set $A(t)$ of allowed single-weight flips is
constructed based on the current configuration $\{J_i^{(t)}\}$ and the $m$ learned patterns.
$A(t)$ includes all integer positions $j \in [1, N]$ with the property that the single-weight
flip $J_j \to -J_j$ does not cause any barely learned pattern $\mu \in [1, m]$ (whose
$h^\mu = 1$) to be misclassified. At time $t' = t + 1/N$, an integer position $j$ is chosen
uniformly at random from the set $A(t)$ and the weight configuration is changed to
$\{J_i^{(t')}\}$ such that $J_i^{(t')} = J_i^{(t)}$ if $i \neq j$ and
$J_j^{(t')} = -J_j^{(t)}$. It is obvious that the new configuration $\{J_i^{(t')}\}$ also
satisfies all of the first $m$ patterns.
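One possible realization of a single SWF move is sketched below in Python. Flipping $J_j$ changes every stability field by $-2 J_j \xi_j^\mu$, so a pattern with $h^\mu = 1$ survives the flip only if $J_j \xi_j^\mu < 0$. The function names are hypothetical, indices are 0-based, and the patterns are assumed to be gauge-transformed; the stability fields are updated incrementally rather than recomputed, which is a design choice of this sketch.

```python
import random

def allowed_single_flips(weights, patterns, fields):
    """Set A(t) for SWF: positions j whose flip keeps every pattern with h = 1 learned."""
    barely = [p for p, h in zip(patterns, fields) if h == 1]
    return [j for j in range(len(weights))
            if all(weights[j] * p[j] < 0 for p in barely)]

def swf_step(weights, patterns, fields, rng):
    """Flip one uniformly chosen allowed weight in place; return False if A(t) is empty."""
    allowed = allowed_single_flips(weights, patterns, fields)
    if not allowed:
        return False
    j = rng.choice(allowed)
    for mu, p in enumerate(patterns):
        fields[mu] -= 2 * weights[j] * p[j]  # flipping J_j shifts h by -2 J_j xi_j
    weights[j] = -weights[j]
    return True
```

In a full simulation each call to `swf_step` would advance the clock by $1/N$.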

The DWF protocol is very similar to the SWF protocol, with the only difference that the
allowed set $A(t)$ at time $t$ contains ordered pairs $(i, j)$ of integer positions with
$i < j$. This set of ordered pairs can also be easily constructed. If, with respect to
configuration $\{J_i^{(t)}\}$, there are no barely learned patterns (whose stability field
$h^\mu = 1$ or 3) among the first $m$ learned patterns, then $A(t)$ contains all the
$N(N-1)/2$ ordered pairs of integers $(i, j)$ with $1 \leq i < j \leq N$. Otherwise, randomly
choose a barely learned pattern, say $\mu_1 \in [1, m]$, and for each integer $i \in [1, N]$
with the property that $J_i \xi_i^{\mu_1} < 0$, do the following: (1) if
$J_i \xi_i^\mu < 0$ for all the other barely learned patterns, then add all the ordered pairs
$(i, j)$ with $j \in [i+1, N]$ into the set $A(t)$; (2) otherwise, add into the set $A(t)$ all
the ordered pairs $(i, j)$ for which the integer $j \in [i+1, N]$ satisfies
$J_j \xi_j^\mu < 0$ for all those barely learned patterns $\mu \in [1, m]$ with
$J_i \xi_i^\mu > 0$.
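The construction of the DWF set $A(t)$ described above can be sketched as follows. This is an illustrative Python sketch under the same assumptions as before (gauge-transformed patterns, 0-based indices, hypothetical function name). Case (1) of the text corresponds to an empty `blocking` list, for which every $j > i$ is accepted.

```python
import random

def allowed_double_flips(weights, patterns, fields, rng):
    """Set A(t) for DWF: ordered pairs (i, j), i < j, built by the rule in the text."""
    n = len(weights)
    barely = [p for p, h in zip(patterns, fields) if h in (1, 3)]
    if not barely:
        # no barely learned pattern: all N(N-1)/2 pairs are allowed
        return [(i, j) for i in range(n) for j in range(i + 1, n)]
    ref = rng.choice(barely)                    # randomly chosen barely learned pattern
    others = [p for p in barely if p is not ref]
    pairs = []
    for i in range(n):
        if weights[i] * ref[i] >= 0:            # require J_i xi_i < 0 for the chosen pattern
            continue
        # barely learned patterns that flipping J_i alone would push down
        blocking = [p for p in others if weights[i] * p[i] > 0]
        for j in range(i + 1, n):
            if all(weights[j] * p[j] < 0 for p in blocking):
                pairs.append((i, j))
    return pairs
```

Every pair produced this way leaves the stability field of each barely learned pattern unchanged or raises it by 4, so no learned pattern can become misclassified.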

The waiting time $\Delta t_{m+1}$ of satisfying the $(m+1)$-th pattern is defined as the total
elapsed time from first satisfying the $m$-th pattern to first satisfying the $(m+1)$-th
pattern. The total time $T_{m+1}$ of satisfying the first $m+1$ patterns is simply
$T_{m+1} = \sum_{\mu=1}^{m+1} \Delta t_\mu$. One time unit corresponds to $N$ elementary local
changes of the weight configuration. The random walk searching process stops if all the $M$
input patterns have been correctly classified, or if the last visited weight configuration
becomes an isolated point (i.e., the set $A(t)$ becomes empty after a new pattern is included
into the set of learned patterns), or if the last waiting time $\Delta t_{m+1}$ exceeds a
preset maximal time value $\Delta t_{\max}$, which is set to $\Delta t_{\max} = 1000$ in the
present work.
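Putting the pieces together, the sequential SWF learning loop with its time accounting and stopping criteria might look like the following Python sketch. It is a simplified illustration rather than the exact simulation code: the function name and return convention are assumptions of this example, patterns are assumed gauge-transformed with $N$ odd, and the walk is declared jammed as soon as the allowed set is found empty.

```python
import random

def learn_sequentially_swf(patterns, dt_max, rng):
    """Present patterns one by one; return (weights, number learned, total time T)."""
    n = len(patterns[0])
    weights = [rng.choice((-1, 1)) for _ in range(n)]
    learned, fields, waits = [], [], []
    for xi in patterns:
        h_new = sum(j * x for j, x in zip(weights, xi))
        dt = 0.0
        while h_new <= 0:                       # new pattern not yet classified
            barely = [p for p, h in zip(learned, fields) if h == 1]
            allowed = [j for j in range(n)
                       if all(weights[j] * p[j] < 0 for p in barely)]
            if not allowed or dt > dt_max:      # jammed, or waiting time exceeded
                return weights, len(learned), sum(waits)
            j = rng.choice(allowed)
            for mu, p in enumerate(learned):
                fields[mu] -= 2 * weights[j] * p[j]
            h_new -= 2 * weights[j] * xi[j]
            weights[j] = -weights[j]
            dt += 1.0 / n                       # one flip = 1/N time units
        learned.append(xi)
        fields.append(h_new)
        waits.append(dt)
    return weights, len(learned), sum(waits)
```

The achieved storage capacity of a run is then the number of learned patterns divided by $N$.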

The SWF and DWF random walk processes described above are very simple to implement, and they
do not overcome any barriers in the energy landscape of the perceptron learning problem.
However, as we demonstrate in the next section, their performance is quite remarkable for
problem instances with pattern length $N \leq 10^3$.

The SWF process, as a local search algorithm, will get stuck in one of the enormously many
metastable states when all the weights become frozen (here we identify a synaptic weight as
being frozen if flipping its value causes at least one of the learned patterns to be
misclassified), at a constraint density value much smaller than the theoretical threshold
value of 0.83. The DWF process will also get jammed if the weight configuration becomes frozen
with respect to all double-weight flips. To further improve the achievable storage capacity of
the SWF and DWF learning processes, a simple relearning strategy is added to the random walk
searching. The basic idea of the relearning strategy is as follows: if some learned patterns
strongly hinder the learning of new patterns, we first ignore them and proceed to learn a
number of new patterns; after that, we learn the ignored patterns again and hope that they can
all be correctly classified.

In the present work, we implement the relearning strategy in the following way. Suppose
that as the $m$-th input pattern is presented to the Ising perceptron, the SWF or the DWF
process is unable to learn it within a waiting time $\Delta t_m \leq \Delta t_{\max}$. We then
remove all the $k$ barely learned patterns $\mu \in [1, m-1]$ with $h^\mu = 1$ from the list
of learned patterns, and proceed to learn the patterns $\mu \in [m, m+k-1]$ in a sequential
manner (stage 1). If the SWF process or the DWF process succeeds in learning these $k$
patterns, we then return to learn the $k$ previously removed patterns again in a sequential
manner (stage 2). If this relearning succeeds, we proceed with the patterns with index
$\mu \geq m+k$. If this attempt fails either at stage 1 or at stage 2, we either stop the whole
random walk learning process or start another trial after removing all the learned patterns.
In practice, we find that the relearning process has a high probability of succeeding in both
stage 1 and stage 2 if $\alpha$ is not too large and the pattern length is of order $10^3$ or
less.
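The control flow of one relearning attempt can be sketched as below. This Python sketch only captures the two-stage bookkeeping; the random walk itself is delegated to two hypothetical callbacks that are assumptions of this example: `field(mu)` returns the current stability field of pattern `mu`, and `try_learn(kept, mu)` runs the SWF or DWF walk for pattern `mu` while keeping the patterns in `kept` satisfied and reports success. Indices are 0-based, and returning `None` when there is nothing to remove is a choice of this sketch, not something specified in the text.

```python
def relearning_attempt(learned, m, n_patterns, field, try_learn):
    """One relearning attempt after pattern m could not be learned.

    learned: indices of currently learned patterns.
    Returns the new list of learned indices, or None if the attempt fails.
    """
    removed = [mu for mu in learned if field(mu) == 1]   # barely learned patterns
    if not removed:
        return None                    # nothing to relax (assumption of this sketch)
    kept = [mu for mu in learned if field(mu) != 1]
    k = len(removed)
    stage1 = list(range(m, min(m + k, n_patterns)))      # k new patterns
    for mu in stage1 + removed:        # stage 1, then stage 2
        if not try_learn(kept, mu):
            return None
        kept.append(mu)
    return kept
```

After a successful attempt, learning would resume with pattern index `m + k`.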



IV. RESULTS 

Figure 2 shows the simulation results for several random walk learning strategies.
For each learning strategy, multiple sets of random input patterns
$(\xi^1, \xi^2, \ldots, \xi^M)$ are generated. Each input pattern $\xi^\mu$ has length $N$.
The random walk learning strategy is then applied to each set of patterns until it stops, at
which point we record the number $m$ of correctly classified patterns and calculate the
achieved storage capacity $\alpha = m/N$. The mean values of $\alpha$ are reported in Fig. 2.
It appears that the storage capacity of all four learning strategies decreases with $N$
roughly as a power law $\alpha \propto N^{-\gamma}$. At each value of $N$, the SWF strategy has
the worst performance, while the DWF strategy with relearning has the best performance.
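A power-law exponent of this kind can be estimated by a linear least-squares fit in log-log coordinates, since $\ln \alpha = \ln c - \gamma \ln N$. The short Python sketch below illustrates the idea; the function name is hypothetical, and this is not necessarily the fitting procedure used to produce Fig. 2.

```python
import math

def fit_power_law(ns, alphas):
    """Least-squares fit of ln(alpha) = ln(c) - gamma * ln(N); returns (c, gamma)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(a) for a in alphas]
    k = len(xs)
    mean_x, mean_y = sum(xs) / k, sum(ys) / k
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_y - slope * mean_x), -slope
```

As a rough consistency check, using only the two DWF values quoted in the abstract ($\alpha \simeq 0.63$ at $N = 101$ and $\alpha \simeq 0.41$ at $N = 1001$), `fit_power_law([101, 1001], [0.63, 0.41])` gives $\gamma \approx 0.19$.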

The SWF strategy is able to reach a storage capacity of $\alpha \simeq 0.36$ for systems of
$N = 101$ and $\alpha \simeq 0.17$ for systems of $N = 1001$. These values are much less than
the theoretical storage capacity of $\alpha_s \simeq 0.83$. However, the DWF strategy performs
much better, with a capacity of $\alpha \simeq 0.63$ for $N = 101$ and $\alpha \simeq 0.41$ for
$N = 1001$. In real neural systems, perceptronal learning of elementary patterns probably does
not involve too many neuronal cells, and a value of $N \sim 10^2$ might be common. For
perceptronal systems with $N \sim 10^2$ to $10^3$, the SWF and DWF strategies can be regarded
as efficient.

If relearning is introduced into the random walk learning strategies, the performance can
be further improved. For the DWF strategy with relearning, we find that the storage capacity
is $\alpha \simeq 0.80$ for $N = 101$ and $\alpha \simeq 0.42$ for $N = 1001$. Relearning is
indeed a biologically relevant strategy in perceptronal learning of real neural systems
[32], [44]. As a comparison, for problem instances of pattern length $N = 1001$, the
belief-propagation-inspired learning strategy of Baldassi and coauthors [16] achieves
$\alpha \simeq 0.47$ when the number $K$ of internal states of their algorithm is set to
$K = 40$. This storage capacity decreases to $\alpha \simeq 0.36$ at $K = 20$ and to
$\alpha \simeq 0.10$ at $K = 10$.

FIG. 2: (Color online) Comparison of the performance of several random walk search strategies.
The achieved storage capacity $\alpha$, averaged over many independent runs (100 for the
smallest $N$ and 10 for the largest $N$), is shown as a function of the pattern length $N$.
The solid lines are power-law fits of the form $\alpha \propto N^{-\gamma}$, with
$\gamma = 0.302, 0.347, 0.198, 0.241$ for SWF, SWF with relearning, DWF, and DWF with
relearning, respectively.

For the same set of input patterns $(\xi^1, \xi^2, \ldots, \xi^m)$, different runs of the SWF
strategy or the DWF strategy lead to different solution configurations. The similarity between
solutions can be measured by an overlap value $q$ as defined by

$$q = \frac{1}{N} \sum_{i=1}^{N} J_i J_i' , \tag{4}$$

where $(J_1, \ldots, J_N)$ and $(J_1', \ldots, J_N')$ are two solutions. The reduced Hamming
distance $d_H$ between two solutions is related to the overlap $q$ by $d_H = (1 - q)/2$. The
typical value of the overlap at constraint density $\alpha \simeq 0.83$ is predicted to be
$q \simeq 0.56$ according to the replica-symmetric calculation [20], suggesting that solutions
are still far away from each other (with a reduced Hamming distance $d_H \simeq 0.22$) as
$\alpha$ approaches the theoretical storage capacity $\alpha_s$.
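The overlap of Eq. (4) and the reduced Hamming distance are straightforward to compute; the Python sketch below (with hypothetical function names) also collects the pairwise distances among a set of solutions, which is the quantity that would be histogrammed to study the distance distribution.

```python
from itertools import combinations

def overlap(j1, j2):
    """Overlap q = (1/N) sum_i J_i J'_i between two +/-1 weight configurations."""
    return sum(a * b for a, b in zip(j1, j2)) / len(j1)

def reduced_hamming(j1, j2):
    """Fraction of positions in which the two configurations differ."""
    return sum(a != b for a, b in zip(j1, j2)) / len(j1)

def pairwise_distances(solutions):
    """Reduced Hamming distances between all distinct pairs of solutions."""
    return [reduced_hamming(a, b) for a, b in combinations(solutions, 2)]
```

The relation $d_H = (1 - q)/2$ follows because each differing position contributes $-1$ and each matching position $+1$ to the sum in Eq. (4); with $q = 0.56$ it gives $d_H = 0.22$.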

Figure 3 shows the histogram $P(d_H)$ of reduced Hamming distances $d_H$ between different
solutions found by the DWF strategy for a single problem instance with constraint density
$\alpha$ and pattern length $N$. Pattern lengths of $N = 101$, $501$, and $1001$ are used, and
100 different solutions are constructed by repeated running of the DWF process. Other problem
instances show similar properties. We notice from Fig. 3 that, at the same value of $\alpha$,
the histograms $P(d_H)$ for different $N$ are peaked at almost the same $d_H$ value, but the
width of $P(d_H)$ decreases as $N$ is enlarged. Such behavior was observed earlier in
Ref. [25] on a slightly modified Ising perceptron problem. The solutions obtained by the DWF
strategy therefore have a typical level of similarity. Figure 3 also demonstrates that, as the
constraint density increases, the histograms $P(d_H)$ shift to smaller $d_H$ values, suggesting
that the level of similarity between the DWF-constructed solutions increases with $\alpha$. At
$\alpha = 0.693$ the typical reduced Hamming distance is $d_H \simeq 0.224$, compatible with
the mean-field predictions [20]. Similar results are obtained for solutions found by the SWF
strategy. In all our simulations, we do not observe double or multiple peaks in the histogram
$P(d_H)$. The results of these and our other numerical simulations (not shown) are consistent
with the proposal that, for a given problem instance, the solutions obtained by the random
walking strategies are members of the same (large) solution cluster of the solution space
[45]. Unlike for the random $K$-satisfiability problem, the random $Q$-coloring problem, or
some locked constraint satisfaction problems [46]-[48], the solution space organization of the
Ising perceptron problem is still not very clear. Kabashima and co-authors [24] suggested that
for $\alpha < 0.83$ the solution space of the Ising perceptron problem is equally dominated by
exponentially many clusters of vanishing entropy and a sub-exponential number of large
clusters. Our simulation results are compatible with this proposal, but more work needs to be
done to clarify the solution space structure of the random Ising perceptron problem.

FIG. 3: (Color online) Histograms of reduced Hamming distances between solutions found by
DWF on a single problem instance of $M$ input patterns of length $N$. 100 solutions are
constructed for each of the five instances with $(M, N) = (45, 101)$, $(70, 101)$,
$(125, 501)$, $(225, 501)$, and $(250, 1001)$, respectively. The solid lines are Gaussian fits
to the histograms.

The total time $T_{\alpha N}$ used by the DWF strategy to correctly classify the first
$\alpha N$ patterns of a problem instance with $N = 1001$ is shown in Fig. 4 as a function of
$\alpha$. The learning time grows almost linearly with $\alpha$ for $\alpha < 0.4$. As the
constraint density $\alpha$ becomes large, different solution communities are expected to form
in the solution space [47]. Then, as $\alpha$ increases further to a certain larger value, the
time needed for the random walk process to escape from a solution community may exceed the
preset maximal waiting time of $\Delta t_{\max} = 1000$, and the DWF process will then stop.
The achieved storage capacity $\alpha$ can be increased to some extent if we make
$\Delta t_{\max}$ larger, but the search process will become more and more viscous as the
solution space of the problem becomes more and more heterogeneous and complex [34]. We do not
attempt to calculate the jamming point of the random walk searching processes.

FIG. 4: (Color online) The learning time $T_{\alpha N}$ as a function of $\alpha$ for three
problem instances of $N = 1001$.



V. DISCUSSION 



We proposed several stochastic learning strategies for the Ising perceptron problem based
on the idea of solution space random walking [34]. Our simulation results in Fig. 2
demonstrated that the DWF strategy is able to correctly classify more than $0.4N$ random input
patterns of length $N$ for $N \leq 1001$. If a simple relearning strategy is added to the DWF
strategy, the learning performance is further improved. The learning time of the DWF strategy
grows roughly linearly with the number of input patterns (for $\alpha < 0.4$ at $N = 1001$;
see Fig. 4). This work suggests that learning by local and random changes of synaptic weights
is efficient for perceptronal systems with $N \sim 10^2$ to $10^3$ neurons. These local
sequential learning strategies may be exploited in some biological perceptronal systems. In
real neuronal systems, the number $N$ of neurons involved in an elementary pattern
classification task may be of the order of $N \sim 10^1$ to $10^3$.

The solutions obtained by the DWF strategy for a given perceptronal learning task are
separated by a typical Hamming distance, which decreases as the number of input patterns
increases (Fig. 3). However, solutions are still far away from each other even close to the
critical capacity. We suspect that, for the problem instances studied in this paper, either
the solution space of the problem is ergodic as a whole, or the solutions reached by the DWF
strategy all belong to the same solution cluster of the solution space. In our random walking
setting, once all weights are frozen, the SWF process can no longer learn the current pattern
with negative stability field, since the current weight configuration is isolated in the
weight space (such a weight configuration is referred to as a completely frozen solution).
Fortunately, DWF is able to go on even if all weights are frozen, since flipping certain pairs
of weights may still be permitted from a configuration in which no single weight is allowed to
be flipped. If no such flippable pairs of weights exist, DWF gets trapped, and the
configuration is isolated once again. As the constraint density $\alpha$ increases, many such
isolated solutions show up, and SWF or DWF, working by single- or double-weight flips, is not
capable of crossing the energy barriers separating the isolated solutions from the connected
ones. These barriers can be bypassed to some extent by the relearning strategy, which helps
the walk escape from these small clusters and lets SWF or DWF keep exploring the large cluster
composed of exponentially many solutions. For small $\alpha$, the replica-symmetric ansatz is
believed to give a good description of the solution space of the Ising perceptron [25]. As
$\alpha$ approaches $\alpha_s$, point-like clusters will form and searching for compatible
weights becomes more difficult [48]. It is desirable to have a theoretical understanding of
the structural evolution of the solution space of the random Ising perceptron problem. How the
dynamics of stochastic local search algorithms is influenced by the solution space structure
of the random Ising perceptron is an important open issue.

Another interesting problem is the generalization problem, where the input-output
associations are no longer uncorrelated but the desired outputs are given by a teacher
perceptron [17], [49]-[51]. The student perceptron tries to learn the rule provided by the
teacher. After a sufficient number of examples are presented to the student perceptron, the
student's weights should match those of the teacher, and the network undergoes a first-order
transition from poor to perfect generalization [49], [50]. It is worthwhile to extend the
current random walk strategies to analyze the generalization problem in Ising perceptrons.